
History of Astronomy: Part 2

Previously, we discussed the history of astronomy, focusing on developments prior to Newtonian mechanics. In this section, we will continue our exploration of the history of astronomy, focusing on developments from the time of Newton to the present day.


Newton's Work

Isaac Newton (1643–1727) built upon the work of Galileo and others to develop a comprehensive theory of motion and gravitation. While the Plague ravaged Europe in 1665, Newton retreated to his family estate in Woolsthorpe, where he made several groundbreaking discoveries. His Philosophiae Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy), published in 1687, laid the foundation for classical mechanics. In the Principia, Newton formulated his three laws of motion and the law of universal gravitation; during the same period he also developed differential and integral calculus (independently developed by Leibniz). He also published his work on optics in Opticks in 1704, where he described the nature of light and color.

Newton was one of the most intelligent and influential scientists in history. When faced with the brachistochrone problem, he solved it in a single night, inventing the calculus of variations in the process. By Newton's request, his solution was published anonymously in the Philosophical Transactions of the Royal Society in 1697. Johann Bernoulli, who had posed the problem, recognized Newton's solution and famously remarked, "I recognize the lion by his claw."

Newton's law of gravitation treats a body with spherical symmetry (such as a planet, to good approximation) as if all of its mass were concentrated at its center. This can be proven either by integrating over the volume of the sphere or by using Gauss's law for gravity. The latter provides a very elegant proof, but did not exist until the developments of Carl Friedrich Gauss in the 19th century.

We should be thoroughly familiar with Newton's laws of motion and gravitation, as they form the basis of classical mechanics. Nevertheless, it is worth briefly reviewing them here. We will do so by deriving Kepler's laws of planetary motion from Newton's law of gravitation. We must first establish the necessary mathematical tools.

Center-of-Mass Reference Frame

In a system of particles, the center of mass (COM) is the weighted average position of all the particles, where the weights are given by their masses. For a system of $N$ particles with masses $m_i$ and positions $\mathbf{r}_i$, the position of the center of mass is given by

$$\mathbf{R} = \frac{1}{M}\sum_{i=1}^{N} m_i \mathbf{r}_i,$$

where $M = \sum_i m_i$ is the total mass of the system.

If we rearrange and differentiate this equation, we have

$$M\frac{d\mathbf{R}}{dt} = \sum_{i=1}^{N} m_i\frac{d\mathbf{r}_i}{dt} = \sum_{i=1}^{N}\mathbf{p}_i = \mathbf{P},$$

where $\mathbf{P}$ is the total momentum of the system. Differentiating again, we have

$$M\frac{d^2\mathbf{R}}{dt^2} = \frac{d\mathbf{P}}{dt} = \sum_{i=1}^{N}\mathbf{F}_i.$$

Anyways, suppose we have a binary system of two particles with masses $m_1$ and $m_2$, and positions $\mathbf{r}_1$ and $\mathbf{r}_2$. Let their center of mass be the origin, i.e., $m_1\mathbf{r}_1 + m_2\mathbf{r}_2 = \mathbf{0}$. Let $\mathbf{r}$ be the relative position vector from particle 1 to particle 2, i.e., $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$. Each particle can then be expressed in terms of the center of mass and the relative position vector as follows:

$$\mathbf{r}_1 = -\frac{m_2}{M}\mathbf{r}, \qquad \mathbf{r}_2 = \frac{m_1}{M}\mathbf{r}.$$

If we define the reduced mass as

$$\mu \equiv \frac{m_1 m_2}{m_1 + m_2} = \frac{m_1 m_2}{M},$$

then each particle's position is

$$\mathbf{r}_1 = -\frac{\mu}{m_1}\mathbf{r}, \qquad \mathbf{r}_2 = \frac{\mu}{m_2}\mathbf{r}.$$

There are a few useful identities that we can derive from this. First, Newton's third law gives $\mathbf{F}_{12} = -\mathbf{F}_{21}$, so if we define the relative acceleration $\mathbf{a} \equiv \ddot{\mathbf{r}} = \ddot{\mathbf{r}}_2 - \ddot{\mathbf{r}}_1$, then

$$\mu\mathbf{a} = \mu\left(\frac{\mathbf{F}_{21}}{m_2} - \frac{\mathbf{F}_{12}}{m_1}\right) = -\mathbf{F}_{12} = \mathbf{F}_{21},$$

where $\mathbf{F}_{12}$ is the force on particle 1 due to particle 2. This means

$$\mu\ddot{\mathbf{r}} = \mathbf{F}_{21}.$$

This means that we can treat the two-body problem as a one-body problem with mass $\mu$ moving under the influence of the force $\mathbf{F}_{21}$. Second, the total kinetic energy of the system is

$$T = \frac{1}{2}m_1 v_1^2 + \frac{1}{2}m_2 v_2^2 = \frac{1}{2}\mu v^2,$$

where $\mathbf{v} = \dot{\mathbf{r}}$ is the relative velocity. Third, the total angular momentum of the system about the center of mass is

$$\mathbf{L} = m_1\mathbf{r}_1\times\mathbf{v}_1 + m_2\mathbf{r}_2\times\mathbf{v}_2 = \mu\,\mathbf{r}\times\mathbf{v}.$$

The gravitational potential energy of the system is

$$U = -\frac{G m_1 m_2}{r} = -\frac{GM\mu}{r}.$$
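The reduced-mass identities above are easy to verify numerically. The masses and vectors below are arbitrary illustrative values, not drawn from any real system:

```python
import numpy as np

# Arbitrary two-body configuration in the COM frame (illustrative values)
m1, m2 = 3.0, 5.0
M = m1 + m2
mu = m1 * m2 / M                   # reduced mass

r = np.array([1.0, 2.0, -0.5])    # relative position r = r2 - r1
v = np.array([0.3, -0.1, 0.2])    # relative velocity v = dr/dt

# Individual positions/velocities implied by the COM condition
r1, r2 = -(mu / m1) * r, (mu / m2) * r
v1, v2 = -(mu / m1) * v, (mu / m2) * v

# The COM sits at the origin
com = (m1 * r1 + m2 * r2) / M

# Kinetic energy: direct sum vs. reduced-mass form (1/2) mu v^2
T_direct = 0.5 * m1 * v1 @ v1 + 0.5 * m2 * v2 @ v2
T_reduced = 0.5 * mu * v @ v

# Angular momentum: direct sum vs. mu (r x v)
L_direct = m1 * np.cross(r1, v1) + m2 * np.cross(r2, v2)
L_reduced = mu * np.cross(r, v)
```

All three pairs agree to machine precision, confirming that the two-body system behaves like a single particle of mass $\mu$.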

Kepler's First Law

We shall use the center-of-mass frame, and begin by considering the torque on the system:

$$\boldsymbol{\tau} = \frac{d\mathbf{L}}{dt} = \frac{d}{dt}\left(\mu\,\mathbf{r}\times\mathbf{v}\right) = \mu\,\mathbf{r}\times\mathbf{a} + \mu\,\mathbf{v}\times\mathbf{v}.$$

The second term is zero, since $\mathbf{v}\times\mathbf{v} = \mathbf{0}$. The first term is also zero, since the gravitational force is central, i.e., $\mathbf{a}\parallel\mathbf{r}$. Thus, $d\mathbf{L}/dt = \mathbf{0}$, so the angular momentum is conserved.

Anyways, we can also write the magnitude of the angular momentum as

$$L = \mu r v_\theta = \mu r^2\dot\theta,$$

where $v_\theta = r\dot\theta$ is the component of the relative velocity perpendicular to $\mathbf{r}$.

Next, consider the acceleration of the vector $\mathbf{r}$. We have

$$\mathbf{a} = -\frac{GM}{r^2}\hat{\mathbf{r}},$$

so the cross product of $\mathbf{a}$ with $\mathbf{L}$ is

$$\mathbf{a}\times\mathbf{L} = -\frac{GM}{r^2}\,\hat{\mathbf{r}}\times\left(\mu\,\mathbf{r}\times\mathbf{v}\right) = -\frac{GM\mu}{r^2}\,\hat{\mathbf{r}}\times\left(\mathbf{r}\times\mathbf{v}\right).$$

Using the vector triple product identity $\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$, together with $\hat{\mathbf{r}}\cdot\mathbf{v} = \dot r$ and $\hat{\mathbf{r}}\cdot\mathbf{r} = r$, we have

$$\mathbf{a}\times\mathbf{L} = -\frac{GM\mu}{r^2}\left[\mathbf{r}\,(\hat{\mathbf{r}}\cdot\mathbf{v}) - \mathbf{v}\,(\hat{\mathbf{r}}\cdot\mathbf{r})\right] = GM\mu\left(\frac{\mathbf{v}}{r} - \frac{\dot r}{r}\hat{\mathbf{r}}\right) = GM\mu\,\frac{d\hat{\mathbf{r}}}{dt}.$$

We can rewrite the left-hand side as $\frac{d}{dt}(\mathbf{v}\times\mathbf{L})$ since $\mathbf{L}$ is constant, so

$$\frac{d}{dt}\left(\mathbf{v}\times\mathbf{L}\right) = GM\mu\,\frac{d\hat{\mathbf{r}}}{dt}.$$

Integrating both sides with respect to time, we have

$$\mathbf{v}\times\mathbf{L} = GM\mu\,\hat{\mathbf{r}} + \mathbf{C},$$

with $\mathbf{C}$ a constant vector of integration.

It is easy to see that $\mathbf{v}\times\mathbf{L}$ is within the plane of the orbit (being perpendicular to $\mathbf{L}$), so $\mathbf{C}$ is also within the plane of the orbit. Also, consider when the magnitude of both sides is greatest. For the left-hand side, this occurs when the velocity is greatest, which is at perihelion. For the right, this occurs when $\hat{\mathbf{r}}$ is parallel to $\mathbf{C}$. Thus, $\mathbf{C}$ must point towards the perihelion.

Taking the dot product of both sides with $\mathbf{r}$, we have

$$\mathbf{r}\cdot\left(\mathbf{v}\times\mathbf{L}\right) = GM\mu\,r + \mathbf{r}\cdot\mathbf{C}.$$

Using the scalar triple product identity $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = (\mathbf{A}\times\mathbf{B})\cdot\mathbf{C}$, we have

$$\left(\mathbf{r}\times\mathbf{v}\right)\cdot\mathbf{L} = GM\mu\,r + rC\cos\theta,$$

where $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{C}$. But $(\mathbf{r}\times\mathbf{v})\cdot\mathbf{L} = L^2/\mu$, so

$$\frac{L^2}{\mu} = GM\mu\,r + rC\cos\theta,$$

which rearranges to

$$r = \frac{L^2/(GM\mu^2)}{1 + \dfrac{C}{GM\mu}\cos\theta}.$$

This is the equation of a conic section in polar coordinates, with one focus at the origin. If we define the eccentricity $e \equiv \dfrac{C}{GM\mu}$, then we can rewrite this as

$$r = \frac{a\left(1 - e^2\right)}{1 + e\cos\theta},$$

where $a$ is the semi-major axis of the ellipse.

Kepler's first law is thus proven: the orbit of a planet is an ellipse. One caveat is that the focus is at the center of mass, not at the Sun as Kepler assumed. Kepler could not have known this, for the center of mass is so close to the Sun that it is nearly indistinguishable from it.

Anyways, we can combine the two expressions for $r$ to find that

$$a\left(1 - e^2\right) = \frac{L^2}{GM\mu^2} \quad\Longrightarrow\quad L = \mu\sqrt{GMa\left(1 - e^2\right)}.$$

When $e = 0$, the orbit is circular, and $L = \mu\sqrt{GMa}$, which is the maximum angular momentum for a given semi-major axis. As $e \to 1$ at fixed $a$, $L \to 0$, and the ellipse degenerates into a radial trajectory.
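We can check the relation $L = \mu\sqrt{GMa(1-e^2)}$ numerically against the angular momentum evaluated at perihelion, where $r_p = a(1-e)$ and the speed follows from the vis-viva relation $v_p^2 = GM\left(\frac{2}{r_p} - \frac{1}{a}\right)$ (quoted here without proof). The Sun-Earth values below are approximate:

```python
import numpy as np

G = 6.674e-11                        # m^3 kg^-1 s^-2
M_sun, m_earth = 1.989e30, 5.972e24  # kg, approximate
M = M_sun + m_earth
mu = M_sun * m_earth / M             # reduced mass

a = 1.496e11                         # semi-major axis in m (~1 AU)
e = 0.0167                           # Earth's orbital eccentricity

# Angular momentum from perihelion distance and speed (vis-viva relation)
r_p = a * (1 - e)
v_p = np.sqrt(G * M * (2 / r_p - 1 / a))
L_perihelion = mu * r_p * v_p

# Angular momentum from the closed-form expression above
L_formula = mu * np.sqrt(G * M * a * (1 - e**2))

# A circular orbit (e = 0) maximizes L for a given semi-major axis
L_circular = mu * np.sqrt(G * M * a)
```

The two computations of $L$ agree, and the eccentric orbit indeed carries less angular momentum than a circular orbit of the same $a$.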

Kepler's Second Law

For the second law, we need to understand how to integrate in polar coordinates. The area element in polar coordinates is $dA = r\,dr\,d\theta$, so given a curve $r(\theta)$, the area enclosed by the curve and the origin is

$$A = \iint r\,dr\,d\theta = \int\frac{1}{2}\,r(\theta)^2\,d\theta.$$

An infinitesimal change in area is thus

$$dA = \frac{1}{2}r^2\,d\theta,$$

and its time derivative is

$$\frac{dA}{dt} = \frac{1}{2}r^2\frac{d\theta}{dt} = \frac{1}{2}r\left(r\frac{d\theta}{dt}\right).$$

As the term $r\,d\theta/dt$ is the tangential velocity $v_\theta$, we have

$$\frac{dA}{dt} = \frac{1}{2}\,r\,v_\theta.$$

Finally, by definition, $\mathbf{r}$ and $v_\theta$ are orthogonal, so $r v_\theta$ is the magnitude of the cross product $\mathbf{r}\times\mathbf{v}$. But that is just the angular momentum divided by the reduced mass, so

$$\frac{dA}{dt} = \frac{1}{2}\left|\mathbf{r}\times\mathbf{v}\right| = \frac{L}{2\mu}.$$

Notice that $L$ is constant (as we proved earlier), so $dA/dt$ is also constant. This proves Kepler's second law: a line segment joining a planet and the Sun sweeps out equal areas during equal intervals of time.

Kepler's Third Law

Historically speaking, Newton's law of gravitation was formulated to explain Kepler's third law. We can now demonstrate that Newton's law of gravitation does indeed lead to Kepler's third law. Begin by integrating Kepler's second law over one full orbit (with period $P$):

$$A = \int_0^P\frac{dA}{dt}\,dt = \frac{L}{2\mu}\,P.$$

The area of an ellipse is $A = \pi a b$, where $a$ and $b$ are the semi-major and semi-minor axes, respectively. Thus we have

$$\pi a b = \frac{L}{2\mu}\,P.$$

From the equation of an ellipse, we have $b = a\sqrt{1 - e^2}$. Also, from earlier, we have $L = \mu\sqrt{GMa\left(1 - e^2\right)}$. Substituting these into the equation for $P$, we have

$$P = \frac{2\pi\mu\,a^2\sqrt{1 - e^2}}{\mu\sqrt{GMa\left(1 - e^2\right)}} = \frac{2\pi}{\sqrt{GM}}\,a^{3/2}.$$

Finally, squaring both sides, we have

$$P^2 = \frac{4\pi^2}{GM}\,a^3,$$

which is Kepler's third law: the square of the orbital period is proportional to the cube of the semi-major axis. Notice that according to this formulation, the square of the period is also inversely proportional to the total mass of the system, which is an observation Kepler could not have made. He relied on Brahe's data for the planets orbiting the Sun, whose mass is so much greater than that of any planet that the total mass $M$ is effectively constant.
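As a quick sanity check, applying $P^2 = 4\pi^2 a^3/(GM)$ to the Sun-Earth system (with approximate constants) should recover a period of about one year:

```python
import math

G = 6.674e-11    # m^3 kg^-1 s^-2
M = 1.989e30     # kg, mass of the Sun (Earth's mass is negligible here)
a = 1.496e11     # m, semi-major axis of Earth's orbit (~1 AU)

# Kepler's third law, solved for the period
P = 2 * math.pi * math.sqrt(a**3 / (G * M))   # seconds
P_days = P / 86400                            # ~365 days
```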

Virial Theorem

One important result that we will need is the virial theorem. For a stable, bound system of particles interacting through a potential $U$, the virial theorem states that

$$\langle T\rangle = -\frac{1}{2}\langle U\rangle,$$

where $T$ is the total kinetic energy of the system, and the angle brackets denote a time average. (Of course, this is only true for potentials that are homogeneous functions of degree $-1$, such as the gravitational potential.) The angle brackets can be formally defined as a functional of the form

$$\langle f\rangle = \frac{1}{\tau}\int_0^\tau f(t)\,dt,$$

where $\tau$ is the time period over which the average is taken.

A homogeneous function of degree $n$ is a function that satisfies the property

$$f(\lambda\mathbf{r}) = \lambda^n f(\mathbf{r}).$$

We can choose a potential of the form $U(r) = -\dfrac{k}{r}$, where $k$ is a constant. This obviously satisfies the property of a homogeneous function of degree $-1$, so our form of the virial theorem applies.

We will define a new term, the virial $G$, as

$$G = \sum_i \mathbf{p}_i\cdot\mathbf{r}_i,$$

where $\mathbf{p}_i$ and $\mathbf{r}_i$ are the momentum and position of the $i$-th particle, respectively. This term is useful because its time derivative is related to the kinetic and potential energies of the system. Taking the time derivative of $G$, we have

$$\frac{dG}{dt} = \sum_i\dot{\mathbf{p}}_i\cdot\mathbf{r}_i + \sum_i\mathbf{p}_i\cdot\dot{\mathbf{r}}_i = \sum_i\mathbf{F}_i\cdot\mathbf{r}_i + \sum_i m_i v_i^2,$$

where $\mathbf{F}_i$ is the force on the $i$-th particle, and we used the fact that $\dot{\mathbf{p}}_i = \mathbf{F}_i$ and $\mathbf{p}_i\cdot\dot{\mathbf{r}}_i = m_i v_i^2$. The second term is just twice the total kinetic energy of the system, so

$$\frac{dG}{dt} = \sum_i\mathbf{F}_i\cdot\mathbf{r}_i + 2T.$$

On the other hand, we can also write the time derivative of $G$ as

$$\frac{dG}{dt} = \frac{d}{dt}\sum_i m_i\,\dot{\mathbf{r}}_i\cdot\mathbf{r}_i = \frac{d}{dt}\left(\frac{1}{2}\frac{dI}{dt}\right) = \frac{1}{2}\frac{d^2I}{dt^2}.$$

Here, $I = \sum_i m_i r_i^2$ is the moment of inertia of the system about the origin. Notice the careful use of the chain rule here; $\frac{d}{dt}\left(\mathbf{r}_i\cdot\mathbf{r}_i\right) = 2\,\dot{\mathbf{r}}_i\cdot\mathbf{r}_i$. Combining the two expressions for $dG/dt$, we have

$$\frac{1}{2}\frac{d^2I}{dt^2} - 2T = \sum_i\mathbf{F}_i\cdot\mathbf{r}_i.$$

The term on the right is known as the virial of Clausius. Clausius (1822–1888) was a German physicist and mathematician who made significant contributions to the field of thermodynamics. He is best known for formulating the second law of thermodynamics and introducing the concept of entropy. He also made important contributions to the kinetic theory of gases and the study of heat transfer.

Next, we need to consider all the forces acting on each particle in the system. Let $\mathbf{F}_{ij}$ denote the force on particle $i$ due to particle $j$. Then, the total force on particle $i$ is

$$\mathbf{F}_i = \sum_{j\neq i}\mathbf{F}_{ij}.$$

Substituting this into the previous equation, we have

$$\sum_i\mathbf{F}_i\cdot\mathbf{r}_i = \sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\mathbf{r}_i.$$

We also have $\mathbf{r}_i = \frac{1}{2}\left(\mathbf{r}_i - \mathbf{r}_j\right) + \frac{1}{2}\left(\mathbf{r}_i + \mathbf{r}_j\right)$, so

$$\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\mathbf{r}_i = \frac{1}{2}\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\left(\mathbf{r}_i - \mathbf{r}_j\right) + \frac{1}{2}\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\left(\mathbf{r}_i + \mathbf{r}_j\right).$$

For the rightmost term, due to Newton's third law, we have $\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, so the term vanishes from symmetry. On the other hand, the left term is not zero, because $\mathbf{F}_{ij}$ and $\mathbf{r}_i - \mathbf{r}_j$ are parallel (as the forces are central). In our context, the force is the gravitational force

$$\mathbf{F}_{ij} = \frac{G m_i m_j}{r_{ij}^2}\,\hat{\mathbf{r}}_{ij},$$

where $r_{ij} = \left|\mathbf{r}_j - \mathbf{r}_i\right|$ and $\hat{\mathbf{r}}_{ij} = \left(\mathbf{r}_j - \mathbf{r}_i\right)/r_{ij}$. The unit vector $\hat{\mathbf{r}}_{ij}$ points from particle $i$ to particle $j$. Thus, we have

$$\frac{1}{2}\sum_i\sum_{j\neq i}\mathbf{F}_{ij}\cdot\left(\mathbf{r}_i - \mathbf{r}_j\right) = -\frac{1}{2}\sum_i\sum_{j\neq i}\frac{G m_i m_j}{r_{ij}} = U,$$

where $U$ is the total gravitational potential energy of the system. Substituting this back into our earlier equation, we have

$$\frac{1}{2}\frac{d^2I}{dt^2} = 2T + U.$$

We can take the time average of both sides over a long time period $\tau$. If the system is stable and bound, then $dI/dt$ remains bounded as $\tau\to\infty$, so the term $\left\langle\frac{d^2I}{dt^2}\right\rangle = \frac{1}{\tau}\left[\dot I(\tau) - \dot I(0)\right]$ averages to zero. Periodic systems, such as planetary systems, are bound in this sense, so this applies to them. Thus, we have

$$2\langle T\rangle + \langle U\rangle = 0.$$

This is the virial theorem, specifically for a $U$ that is a homogeneous function of degree $-1$; rearranging gives $\langle T\rangle = -\frac{1}{2}\langle U\rangle$. If the degree were $n$ instead, we would have $2\langle T\rangle = n\langle U\rangle$. As a corollary, the time average of the total energy of the system is

$$\langle E\rangle = \langle T\rangle + \langle U\rangle = \frac{1}{2}\langle U\rangle = -\langle T\rangle.$$
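For a circular two-body orbit, the virial theorem holds instantaneously, not just on average: with $v^2 = GM/r$ for the relative orbit, $T = \frac{1}{2}\mu v^2$ is exactly $-U/2$. A quick check with approximate Sun-Earth values:

```python
import math

G = 6.674e-11
M_sun, m_earth = 1.989e30, 5.972e24  # kg, approximate
M = M_sun + m_earth
mu = M_sun * m_earth / M             # reduced mass

r = 1.496e11          # circular orbit radius, m (~1 AU)
v2 = G * M / r        # squared relative speed on a circular orbit

T = 0.5 * mu * v2     # kinetic energy
U = -G * M * mu / r   # potential energy (= -G m1 m2 / r)
E = T + U             # total energy, equal to -T by the virial theorem
```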

Light and Optics

Another field that Newton made significant contributions to was optics. In his book Opticks, published in 1704, Newton described his experiments with light and color. He demonstrated that white light is composed of a spectrum of colors, which can be separated using a prism. He also proposed the particle theory of light, suggesting that light is made up of tiny particles called "corpuscles." This theory was later challenged by the wave theory of light, which was supported by experiments such as Thomas Young's double-slit experiment in 1801.

Previously, we discussed how Galileo used a refracting telescope to make astronomical observations. Over the years, the mathematical and physical understanding of optics improved, leading to the development of more advanced telescopes.

The focal plane of a lens or mirror is the plane where light rays converge to form an image. In astronomy, the focal plane is where the image of a celestial object is formed by the telescope's optics. We assume that celestial objects are located practically at infinity, so the light rays entering the telescope are parallel. If such rays approach a lens at an angle $\theta$ relative to the optical axis, they will be focused to a point in the focal plane at a distance $y = f\tan\theta$ from the optical axis, where $f$ is the focal length of the lens. For small angles, we can approximate $\tan\theta\approx\theta$ (in radians), so the distance from the optical axis in the focal plane is approximately

$$y \approx f\theta.$$

This relationship is also captured by the plate scale, which is a differential relation given by

$$\frac{dy}{d\theta} = f.$$

What does this mean? If we increase the focal length $f$, then $dy/d\theta$ increases, meaning that a small change in angle corresponds to a larger change in position in the focal plane. This is desirable in astronomy, as celestial objects are often very far away, resulting in very small angular separations. By increasing the focal length, we can spread out the image in the focal plane, allowing for better resolution and detail.
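As a quick illustration of the plate scale (assuming a hypothetical 10 m focal length), a 1 arcsecond separation on the sky maps to roughly 50 micrometers in the focal plane:

```python
import math

f = 10.0                                        # focal length in m (hypothetical)
theta_arcsec = 1.0                              # angular separation on the sky
theta = theta_arcsec * math.pi / (180 * 3600)   # convert arcseconds to radians

y = f * theta          # linear separation in the focal plane, y = f * theta
y_micron = y * 1e6     # ~48.5 micrometers
```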

Stellar Parallax and the Magnitude System

In their original forms, Kepler's laws of planetary motion were based on the relative sizes of the orbits, for their absolute sizes were not known. In 1761, however, the first successful measurement of the distance to Venus was made during a transit of Venus across the Sun, a measurement credited to Jean-Baptiste Chappe d'Auteroche. By observing the transit from different locations on Earth, astronomers were able to use parallax to calculate the distance to Venus and, consequently, the distance from Earth to the Sun (the astronomical unit, or AU). In 1838, Friedrich Bessel made the first successful measurement of stellar parallax, determining the distance to the star 61 Cygni. This parallax effect is extremely useful; our eyes use it to perceive depth, and astronomers use it to measure distances to nearby stars.

If the distance from the Sun to a star is $d$, and the lines (blue and yellow in the figure above) are perpendicular, the parallax angle is given by

$$\tan p = \frac{1\ \text{AU}}{d} \quad\Longrightarrow\quad p \approx \frac{1\ \text{AU}}{d}$$

from simple trigonometry, where $p$ is the parallax angle and the small-angle approximation holds because $d \gg 1$ AU.

As one radian is $206{,}265$ arcseconds, we have

$$d = \frac{206{,}265}{p''}\ \text{AU},$$

where $p''$ is the parallax angle measured in arcseconds, and we define the parsec (pc, parallax-second) as the numerator, $1\ \text{pc} = 206{,}265$ AU, so that

$$d = \frac{1}{p''}\ \text{pc}.$$
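The parallax-distance relation is a one-liner. For instance, using an assumed modern parallax of about $0.286''$ for 61 Cygni puts it at roughly 3.5 pc:

```python
def parallax_to_distance(p_arcsec):
    """Distance in parsecs from a parallax angle in arcseconds."""
    return 1.0 / p_arcsec

d_pc = parallax_to_distance(0.286)   # 61 Cygni, assumed approximate value
d_au = d_pc * 206265                 # the same distance expressed in AU
```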

As the distances to stars are so vast, the parallax angles are extremely small. This was why it took so long to notice the effect. Modern instruments are being developed to measure ever-smaller parallax angles. For instance, NASA's proposed Space Interferometry Mission (SIM, since cancelled) aimed to measure parallax angles with a precision of 1 microarcsecond, allowing for distance measurements to stars up to 10,000 light-years away.

Magnitude Scale

Other than distance, another important property of stars is their brightness. Hipparchus, whom we met earlier, developed the first known magnitude scale for classifying the brightness of stars. It was quite primitive, using the naked eye to classify stars into six categories, with $m = 1$ being the brightest and $m = 6$ being the faintest visible to the naked eye. This quantity is known as the apparent magnitude of a star. Primitive as it was, it was nevertheless used as the basis for more recent magnitude scales.

For instance, in 1856, Norman Pogson proposed a logarithmic scale for magnitudes, where a difference of 5 magnitudes corresponds to a brightness ratio of exactly 100. In other words, an $m = 1$ star is 100 times brighter than an $m = 6$ star, and $100^{1/5} \approx 2.512$ times brighter than an $m = 2$ star. This value, $100^{1/5}$, is known as Pogson's ratio. Additionally, the scale was extended to include negative magnitudes for very bright objects, such as the Sun ($m \approx -26.8$) and the full Moon ($m \approx -12.7$).

To further understand the magnitude scale, we need to understand how we measure brightness. Suppose we have a detector with an effective area $A$ that is a distance $r$ away from a star emitting a total power (luminosity) $L$. The radiant flux (or irradiance) $F$ is defined as the power received per unit area, i.e.,

$$F = \frac{P}{A},$$

where $P$ is the power incident on the detector.

To get a more useful quantity, pretend that we have completely surrounded the star with a sphere of radius $r$. It then emits a total power $L$ through the surface area of the sphere, which is $4\pi r^2$. The radiant flux at the surface of the sphere is then

$$F = \frac{L}{4\pi r^2}.$$

Notice that since the flux is a function of distance, it does not matter what the shape of the detector is or its area; the flux is the same everywhere on the sphere. This relation is known as the inverse-square law, and it applies to any point source of radiation in three-dimensional space.
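The inverse-square law is easy to see numerically: doubling the distance quarters the flux. Using the Sun's approximate luminosity recovers the familiar solar constant at 1 AU:

```python
import math

def flux(L, r):
    """Radiant flux (W/m^2) at distance r (m) from a source of luminosity L (W)."""
    return L / (4 * math.pi * r**2)

L_sun = 3.828e26   # W, approximate solar luminosity
au = 1.496e11      # m, 1 AU

F1 = flux(L_sun, au)        # ~1361 W/m^2, the solar constant
F2 = flux(L_sun, 2 * au)    # exactly a quarter of F1
```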

Now we can define the absolute magnitude $M$ of a star as the apparent magnitude it would have if it were located at a standard distance of 10 parsecs (pc) from Earth. Mathematically, we have from Pogson's scale that

$$m_1 - m_2 = -2.5\log_{10}\!\left(\frac{F_1}{F_2}\right),$$

where $F_1$ and $F_2$ are the fluxes (brightnesses) of the two stars. Equivalently,

$$\frac{F_2}{F_1} = 100^{(m_1 - m_2)/5}.$$

There are two important cases to consider. First, consider a star of fixed luminosity observed at variable distances. If we let $m$ be the apparent magnitude of that star at distance $d$, and $M$ be its absolute magnitude at distance 10 pc, then we have

$$\frac{F_{10}}{F} = 100^{(m - M)/5},$$

where $F$ is the flux at distance $d$, and $F_{10}$ is the flux at distance 10 pc. This allows us to relate the apparent and absolute magnitudes of a star. We have, from the inverse-square law, that

$$\frac{F_{10}}{F} = \left(\frac{d}{10\ \text{pc}}\right)^2.$$

Combining these two equations, we have

$$m - M = 5\log_{10}\!\left(\frac{d}{10\ \text{pc}}\right) = 5\log_{10}d - 5,$$

with $d$ measured in parsecs. The quantity $m - M$ is known as the distance modulus, and it provides a way to estimate the distance to a star based on its apparent and absolute magnitudes.
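Inverting the distance modulus gives $d = 10^{(m - M)/5 + 1}$ parsecs, which is a one-line function:

```python
def distance_from_modulus(m, M_abs):
    """Distance in parsecs from apparent magnitude m and absolute magnitude M_abs."""
    return 10 ** ((m - M_abs) / 5 + 1)

d1 = distance_from_modulus(5.0, 5.0)    # m - M = 0  ->  10 pc, by definition
d2 = distance_from_modulus(10.0, 5.0)   # m - M = 5  ->  100 pc
```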

The second case is when we have two stars at the same distance but with different luminosities. If we let one of the stars be the Sun with absolute magnitude $M_\odot$ and luminosity $L_\odot$, and the other be a star with absolute magnitude $M$ and luminosity $L$, then we have

$$M - M_\odot = -2.5\log_{10}\!\left(\frac{F_{10}}{F_{10,\odot}}\right),$$

where both fluxes are evaluated at the standard distance of 10 pc. We can express the luminosities in terms of their radiant fluxes at that distance. Using the inverse-square law, and noting that $d = 10$ pc for both stars, we have

$$F_{10} = \frac{L}{4\pi(10\ \text{pc})^2}.$$

Similarly, we have $F_{10,\odot} = \dfrac{L_\odot}{4\pi(10\ \text{pc})^2}$, so we have

$$M - M_\odot = -2.5\log_{10}\!\left(\frac{L}{L_\odot}\right) \quad\Longrightarrow\quad \frac{L}{L_\odot} = 100^{(M_\odot - M)/5}.$$
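With the Sun's absolute (bolometric) magnitude of approximately $M_\odot \approx 4.74$, the relation $L/L_\odot = 100^{(M_\odot - M)/5}$ converts absolute magnitudes to luminosities:

```python
M_SUN = 4.74   # approximate absolute bolometric magnitude of the Sun

def luminosity_ratio(M_abs):
    """L / L_sun for a star of absolute magnitude M_abs."""
    return 100 ** ((M_SUN - M_abs) / 5)

ratio_faint = luminosity_ratio(M_SUN + 5)    # 5 magnitudes fainter -> 1/100 L_sun
ratio_bright = luminosity_ratio(M_SUN - 5)   # 5 magnitudes brighter -> 100 L_sun
```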

Light and Wave Phenomena

Newton, as he stated in Opticks, believed that light was made up of tiny particles called "corpuscles." One justification he gave was that shadows had sharp edges, which would be difficult to explain if light were a wave. This theory was later challenged by the wave theory of light, which was supported by experiments such as Thomas Young's double-slit experiment in 1801. We now know that light is an excitation of a quantized electromagnetic field, and it exhibits both wave-like and particle-like properties. The wave-like properties come from oscillatory terms in field operators, while the particle-like properties originate from creation and annihilation operators. Anyways, we will briefly review how humans came to understand the wave nature of light.

Initially, light was thought to travel instantaneously, as no delay could be detected by the naked eye. In 1676, Ole Rømer made the first quantitative estimate of the speed of light by observing the eclipses of Jupiter's moons. He noticed that the observed times of the eclipses varied depending on the distance between Earth and Jupiter. By analyzing these variations, he estimated the speed of light to be approximately $2.2\times10^8\ \mathrm{m/s}$, which is about 25% lower than the actual value.

In modern times, the speed of light in a vacuum is a defining constant in the International System of Units (SI). It is exactly $299{,}792{,}458\ \mathrm{m/s}$, and it is denoted by the symbol $c$. The meter is now derived using the speed of light, defined as the distance light travels in a vacuum in $1/299{,}792{,}458$ of a second.

Advancing further, Christiaan Huygens (1629–1695) proposed the wave theory of light in the late 17th century. He suggested that light propagates as a wave, similar to sound or water waves. As such, light waves had the same mathematical description, with quantities like wavelength $\lambda$, frequency $\nu$, and speed $v$ related by the equation

$$v = \lambda\nu.$$

Huygens' principle states that every point on a wavefront can be considered a source of secondary wavelets, which spread out in all directions at the same speed as the wave itself. The new wavefront is then the envelope of these secondary wavelets.

One demonstration of the wave nature of light came from Thomas Young's double-slit experiment in 1801. In this experiment, light from a single source passes through two pinholes (nowadays we use slits) and creates an interference pattern on a screen behind the slits. The pattern consists of alternating bright and dark fringes, which can be explained by the constructive and destructive interference of the light waves emanating from the two slits.

If the distance between the slits is $d$ and the distance from the slits to the screen is $L$, then at an angle $\theta$ from the central axis, the path difference between the two waves is $d\sin\theta$. Constructive interference (bright fringes) occurs when the path difference is an integer multiple of the wavelength, i.e.,

$$d\sin\theta = m\lambda, \qquad m = 0, 1, 2, \ldots,$$

whereas destructive interference (dark fringes) occurs when the path difference is an odd multiple of half the wavelength, i.e.,

$$d\sin\theta = \left(m + \frac{1}{2}\right)\lambda, \qquad m = 0, 1, 2, \ldots$$

This can also be extended to multiple slits, which is the basis for diffraction gratings used in spectroscopy. In a diffraction grating with $N$ slits, the condition for constructive interference becomes

$$d\sin\theta = m\lambda,$$

where $d$ is now the distance between adjacent slits.
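For example, with assumed values of a 500 nm wavelength and a slit spacing of $d = 2\ \mu\text{m}$, the first-order ($m = 1$) bright fringe of $d\sin\theta = m\lambda$ appears near $14.5°$:

```python
import math

wavelength = 500e-9   # m (assumed green light)
d = 2e-6              # slit spacing in m (assumed)
m = 1                 # interference order

sin_theta = m * wavelength / d                     # 0.25
theta_deg = math.degrees(math.asin(sin_theta))     # ~14.5 degrees
```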

Rayleigh Criterion

We return to the discussion of telescopes. In addition to being able to focus light to a point, a telescope must also be able to resolve two closely spaced objects. By "resolve," we mean being able to distinguish the two objects as separate entities, rather than a single blurred object, especially when they are very close together in the sky. The ability to resolve two objects is limited by diffraction, which causes light waves to spread out as they pass through an aperture, such as the lens or mirror of a telescope.

To explain how single-slit diffraction occurs, consider a slit of width $a$ illuminated by monochromatic light of wavelength $\lambda$. According to Huygens' principle, every point within the slit acts as a source of secondary wavelets. These wavelets interfere with each other, leading to a diffraction pattern on a screen placed far away.

Suppose one ray travels straight through the center of the slit to a point on the screen at an angle $\theta$. Another ray travels from the top edge of the slit (a distance $a/2$ from the center) to the same point on the screen. The condition for destructive interference (a dark fringe) between these two rays is that the path difference between them is equal to half a wavelength, i.e.,

$$\frac{a}{2}\sin\theta = \frac{\lambda}{2}.$$

We can repeat this argument for rays originating from other points within the slit, leading to the general condition for destructive interference:

$$a\sin\theta = m\lambda, \qquad m = 1, 2, 3, \ldots$$


Computer-generated Airy pattern created by a circular aperture. The central bright region is known as the Airy disk, surrounded by concentric rings of decreasing brightness. The first dark ring occurs at an angle $\theta$ where $\sin\theta = 1.22\,\lambda/D$, which defines the Rayleigh criterion for resolution. (Public domain image by Sakurambo, English Wikipedia.)

If we have a circular aperture (like in most telescopes), by symmetry, the diffraction pattern will be circularly symmetric. To analytically derive the diffraction pattern, we will have to double integrate over the circular aperture. This will come later, but for now we just need to know that Sir George Biddell Airy (1801–1892) worked out the mathematics in 1835. It is for this reason that the central bright region of the diffraction pattern is known as the Airy disk. The equation is similar to the single-slit case, $D\sin\theta = m\lambda$ with $D$ the aperture diameter, but this time, $m$ takes on non-integer values.

Ring              $m$      Intensity ratio
Central maximum   0.000    1.0000
First minimum     1.220    0
Second maximum    1.635    0.0175
Second minimum    2.233    0
Third maximum     2.679    0.0042
Third minimum     3.238    0

(Table taken from Carroll & Ostlie, Table 6.1.)

Now the key point is this: suppose we have two point sources of light (e.g., two stars) separated by a small angle $\theta$. Each point source will produce its own Airy disk on the image plane of the telescope. They will superimpose to form a combined diffraction pattern. If the two sources are very close together, their Airy disks will overlap significantly, making it difficult to distinguish them as separate sources.


Different superimposed Airy patterns. The leftmost pattern shows two Airy disks that are well separated, allowing for clear resolution of the two sources. The middle pattern shows two Airy disks that are closer together, with their first dark rings overlapping, making it more challenging to resolve them. The rightmost pattern shows two Airy disks that are so close that they merge into a single blurred pattern, making it impossible to distinguish the two sources.


Can we quantify when two point sources can be resolved? Consider when two sources are close together such that they are not clearly separated. This means that the angular separation is small. We define the criterion to be that the central maximum of one Airy disk coincides with the first minimum of the other. From the Airy disk condition, we have

$$\theta_{\min} \approx 1.22\,\frac{\lambda}{D}$$

for small angles, where $D$ is the diameter of the aperture. This is known as the Rayleigh criterion for resolution, named after Lord Rayleigh (John William Strutt, 1842–1919), who formulated it in 1879. Notice that the minimum resolvable angle $\theta_{\min}$ decreases as the aperture diameter $D$ increases, and increases with increasing wavelength $\lambda$. This means that larger telescopes can resolve finer details, and shorter wavelengths of light (e.g., blue light) can resolve finer details than longer wavelengths (e.g., red light).
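Plugging a Hubble-class 2.4 m aperture and 550 nm visible light (assumed representative values) into $\theta_{\min} = 1.22\,\lambda/D$ gives a diffraction limit of a few hundredths of an arcsecond:

```python
wavelength = 550e-9   # m, visible light (assumed)
D = 2.4               # m, aperture diameter (Hubble-class telescope)

theta_min = 1.22 * wavelength / D    # radians
theta_arcsec = theta_min * 206265    # ~0.06 arcseconds
```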

Real-life observations do not always achieve the theoretical limit set by the Rayleigh criterion. This is due to many factors, one of which is the turbulence in Earth's atmosphere, which causes the light from celestial objects to be distorted as it passes through the atmosphere. This effect is known as astronomical seeing. To mitigate this, astronomers use techniques such as adaptive optics, which involve using deformable mirrors to correct for atmospheric distortions in real-time. Additionally, space-based telescopes, such as the Hubble Space Telescope, avoid atmospheric effects altogether by operating above Earth's atmosphere.

Also, as we know, the index of refraction depends on the wavelength of light, leading to chromatic aberration in lenses. This means that different colors of light are focused at different points, causing blurring and color fringing in images. To reduce chromatic aberration, we can introduce correcting lenses made of different types of glass with varying dispersion properties.

Electrodynamics and the EM Spectrum

Prior to the 19th century, there were three interesting phenomena known to physicists: electricity, magnetism, and light. It was not until the work of James Clerk Maxwell (1831–1879) that these phenomena were unified into a single theory known as electromagnetism. Maxwell's equations, published in the 1860s, describe how electric and magnetic fields interact and propagate through space.

As we have seen multiple times, Maxwell's equations predict the existence of electromagnetic waves that propagate at the speed of light. Heinrich Hertz (1857–1894) experimentally confirmed the existence of these waves in the late 1880s. He produced and detected radio waves in the laboratory, demonstrating that they exhibited the same properties as light, such as reflection, refraction, and polarization. Unfortunately, this work was only done ten years after Maxwell's death, so he did not live to see his predictions confirmed.

John Henry Poynting (1852–1914) further developed the theory of electromagnetism by describing the energy flow in electromagnetic fields. He introduced the Poynting vector

$$\mathbf{S} = \frac{1}{\mu_0}\mathbf{E}\times\mathbf{B},$$

which represents the directional energy flux, equal to the energy per unit area per unit time (power per unit area) carried by an electromagnetic wave. Its time average gives the average power flow, which in a vacuum is

$$\langle\mathbf{S}\rangle = \frac{E_0 B_0}{2\mu_0}\,\hat{\mathbf{k}},$$

where $E_0$ and $B_0$ are the amplitudes of the electric and magnetic fields, respectively, and $\hat{\mathbf{k}}$ is the unit vector in the direction of wave propagation. Knowing that energy exists in electromagnetic waves, it is natural to deduce that light can cause physical effects. Suppose light is completely absorbed by a surface at normal incidence. The equivalent force due to radiation pressure is

$$F = \frac{\langle S\rangle A}{c},$$

where $A$ is the area of the surface. If the light is completely reflected, the force is doubled, i.e.,

$$F = \frac{2\langle S\rangle A}{c}.$$

This effect is extremely small for everyday light intensities, but it can be significant in astrophysical contexts, such as the pressure exerted by sunlight on comet tails or the concept of solar sails for spacecraft propulsion. When we study early main-sequence stars, accelerating particles, and other high-energy astrophysical phenomena, we will see that radiation pressure can be a dominant force.
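To see how small the effect is at everyday scales, take the mean solar irradiance at 1 AU ($\langle S\rangle \approx 1361\ \mathrm{W/m^2}$) falling on a 1 m² sail:

```python
S = 1361.0      # W/m^2, mean solar irradiance at 1 AU (the solar constant)
A = 1.0         # m^2, sail area
c = 2.998e8     # m/s, speed of light

F_absorbed = S * A / c         # fully absorbing surface: ~4.5 micronewtons
F_reflected = 2 * S * A / c    # fully reflecting surface: twice the force
```

Micronewtons per square meter are negligible on Earth, but over years of continuous thrust in space they can matter, which is exactly the premise of solar sails.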

Radiation and the Birth of Quantum Mechanics

As previously discussed, Maxwell's equations predict the existence of electromagnetic waves that propagate at the speed of light. These waves can have a wide range of frequencies and wavelengths, forming the electromagnetic spectrum.

When an object is heated, the average kinetic energy of its atoms and molecules increases, causing them to vibrate more vigorously. The vibration of charged particles (such as electrons) produces electromagnetic radiation. The spectrum of this radiation depends on the temperature of the object. This phenomenon was first studied systematically by Thomas Wedgwood (1771–1805) in the late 18th century, who observed that heated objects emit light.

A perfect blackbody is an idealized object that absorbs all incident radiation, regardless of frequency or angle of incidence. When a blackbody is heated, it emits radiation with a characteristic spectrum that depends only on its temperature. To mathematically describe this phenomenon, German physicist Wilhelm Wien (1864–1928) proposed Wien's displacement law in 1893, which states that the wavelength $\lambda_{\max}$ at which the emission of a blackbody spectrum is maximized is inversely proportional to its temperature $T$:

$$\lambda_{\max} = \frac{b}{T},$$

where $b \approx 2.898\times10^{-3}\ \mathrm{m\,K}$ is a proportionality factor known as Wien's displacement constant. Another law that was important was the Stefan-Boltzmann law, named as such because it was first derived empirically by Josef Stefan (1835–1893) in 1879 and later derived theoretically by Ludwig Boltzmann (1844–1906) in 1884. It states

$$L = A\sigma T^4,$$

where $L$ is the total power radiated by a blackbody, $A$ is its surface area, $T$ is its absolute temperature, and $\sigma \approx 5.670\times10^{-8}\ \mathrm{W\,m^{-2}\,K^{-4}}$ is the Stefan-Boltzmann constant. In astrophysics, stars are not perfect blackbodies, so we actually use the law to define an effective temperature $T_e$ of a star, which is the temperature a perfect blackbody would need to have to emit the same total power per unit area as the star.
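Applying both laws to the Sun ($T_e \approx 5772$ K, $R \approx 6.957\times10^8$ m, approximate values) recovers the observed spectral peak near 500 nm and a luminosity of about $3.8\times10^{26}$ W:

```python
import math

b = 2.898e-3       # m K, Wien's displacement constant
sigma = 5.670e-8   # W m^-2 K^-4, Stefan-Boltzmann constant

T = 5772.0         # K, solar effective temperature (approximate)
R = 6.957e8        # m, solar radius (approximate)

lam_max = b / T                           # Wien's law: peak near 502 nm
L = 4 * math.pi * R**2 * sigma * T**4     # Stefan-Boltzmann: ~3.8e26 W
```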

By the end of the 19th century, physicists and astronomers believed that they had a complete understanding of the laws of physics: Newtonian mechanics and Maxwell's electromagnetism were well-established theories that explained what seemed to be all physical phenomena, and many thought that only minor details remained to be discovered. It is worth noting how remarkably successful these theories were despite what we now know to be their limitations. This Newtonian paradigm was built upon centuries of scientific progress, some of which we have discussed in this section: the works of ancient Greek philosophers like Aristotle and Ptolemy, the revolutionary ideas of Copernicus, Galileo, and Kepler during the Renaissance, and the groundbreaking contributions of Newton, Maxwell, et al. from the 17th through the 19th centuries.

However, several experimental results in the late 19th and early 20th centuries challenged this Newtonian paradigm and led to the development of new theories. One of the most significant challenges came from the study of blackbody radiation. Physicists realized that there was no theoretical basis for the exact shape of the blackbody spectrum. Lord Rayleigh (1842–1919) attempted to derive the spectrum using electromagnetic theory and classical statistical mechanics. Specifically, he considered a cavity with perfectly reflecting walls, which contains standing electromagnetic waves. Let $L$ be the length of each side of the cavity.

As they are standing waves, the permitted wavelengths are $\lambda = 2L, L, \frac{2L}{3}, \ldots$, or more generally $\lambda_n = \frac{2L}{n}$ for $n = 1, 2, 3, \ldots$ Each wavelength would receive energy equal to $kT$ from the equipartition theorem, where $k$ is Boltzmann's constant. This eventually led to the Rayleigh-Jeans law

$$B_\lambda(T) = \frac{2ckT}{\lambda^4},$$

where $B_\lambda(T)$ is the spectral radiance of the blackbody at temperature $T$. This law agreed with experimental results at long wavelengths (low frequencies) but diverged significantly at short wavelengths (high frequencies), leading to what was known as the "ultraviolet catastrophe." Another law that was proposed was Wien's law, which was an empirical fit to the blackbody spectrum at short wavelengths. It is given by

$$B_\lambda(T) = \frac{a}{\lambda^5}\,e^{-b/(\lambda T)},$$

where $a$ and $b$ are constants determined experimentally. Wien's law agreed with experimental results at short wavelengths but failed at long wavelengths.

The resolution to this problem came from Max Planck (1858–1947) in 1900. Planck proposed that Wien's law could be slightly modified to fit the entire blackbody spectrum:

$$B_\lambda(T) = \frac{a/\lambda^5}{e^{b/(\lambda T)} - 1}.$$
Here is the important part. To identify the constants $a$ and $b$, Planck made a bold assumption: he proposed that a wave could only have specific energies that were integer multiples of a minimum energy (a quantum of energy) given by

$$E = h\nu = \frac{hc}{\lambda},$$

where $h$ is now known as Planck's constant. While he initially sought to take the limit $h \to 0$ to recover classical physics, he found that this was not possible if he wanted to fit the experimental data. Eventually, he derived Planck's law for blackbody radiation:

$$B_\lambda(T) = \frac{2hc^2/\lambda^5}{e^{hc/(\lambda k_B T)} - 1}.$$
Equivalently, as a function of frequency, it is

$$B_\nu(T) = \frac{2h\nu^3/c^2}{e^{h\nu/(k_B T)} - 1}.$$
This law agreed perfectly with experimental results across all wavelengths and temperatures.
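As a quick numerical sketch (illustrative, not part of the historical argument), we can compare the two laws directly. At long wavelengths the Rayleigh-Jeans prediction tracks Planck's law closely, while at short wavelengths it overshoots by many orders of magnitude: the ultraviolet catastrophe in numbers. The constants below are the modern SI values:

```python
import math

# Physical constants (SI units)
h = 6.62607015e-34   # Planck's constant, J s
c = 2.99792458e8     # speed of light, m/s
k_B = 1.380649e-23   # Boltzmann's constant, J/K

def planck(lam, T):
    """Planck's law: spectral radiance B_lambda(T) for wavelength lam (m)."""
    return (2 * h * c**2 / lam**5) / (math.exp(h * c / (lam * k_B * T)) - 1)

def rayleigh_jeans(lam, T):
    """Classical Rayleigh-Jeans law: accurate only at long wavelengths."""
    return 2 * c * k_B * T / lam**4

T = 5772.0  # K, roughly the Sun's effective temperature
for lam_nm in (100, 500, 5000, 100_000):
    lam = lam_nm * 1e-9
    ratio = rayleigh_jeans(lam, T) / planck(lam, T)
    print(f"{lam_nm:>7} nm: Rayleigh-Jeans / Planck = {ratio:.3g}")
```

At $\lambda = 0.1\,\mathrm{mm}$ the ratio is close to 1, while at $100\,\mathrm{nm}$ the classical law overestimates the radiance by roughly nine orders of magnitude.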

Let's explore how this can be used for a star of radius $R$ and temperature $T$. The quantity $B_\lambda$ is the spectral radiance, which is the power emitted per unit area per unit solid angle per unit wavelength. A solid angle is defined in spherical coordinates as

$$d\Omega = \sin\theta \, d\theta \, d\phi,$$
where $\theta$ is the polar angle (measured from the $z$-axis) and $\phi$ is the azimuthal angle (measured in the $xy$-plane from the $x$-axis). As $B_\lambda$ is per wavelength, area, and solid angle, the infinitesimal power emitted in a wavelength interval $d\lambda$, from an area $dA$, into a solid angle $d\Omega$ is given by

$$dP = B_\lambda \cos\theta \, d\lambda \, dA \, d\Omega.$$
The $\cos\theta$ term arises because the effective area "seen" by the radiation is reduced by a factor of $\cos\theta$ when the radiation is emitted at an angle $\theta$ from the normal to the surface. Now, to get the total power emitted per unit area over all outward directions for a wavelength interval $d\lambda$, we integrate over the hemisphere:

$$dF = d\lambda \int_0^{2\pi} \int_0^{\pi/2} B_\lambda \cos\theta \sin\theta \, d\theta \, d\phi = \pi B_\lambda \, d\lambda.$$
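As a sanity check, the angular integral over the outward hemisphere can be evaluated numerically; a minimal sketch using a simple midpoint rule:

```python
import math

# Verify numerically that
#   int_0^{2 pi} dphi  int_0^{pi/2} cos(theta) sin(theta) dtheta  =  pi
N = 100_000
dtheta = (math.pi / 2) / N
# Midpoint rule in theta; the phi integral just contributes a factor of 2*pi.
theta_integral = sum(
    math.cos((i + 0.5) * dtheta) * math.sin((i + 0.5) * dtheta) for i in range(N)
) * dtheta
result = 2 * math.pi * theta_integral
print(result, math.pi)  # both approximately 3.14159
```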
Multiplying the emitted power per unit area by the star's surface area $4\pi R^2$ and plugging in Planck's law, we have

$$L_\lambda \, d\lambda = 4\pi R^2 \cdot \pi B_\lambda \, d\lambda = \frac{8\pi^2 R^2 hc^2/\lambda^5}{e^{hc/(\lambda k_B T)} - 1} \, d\lambda,$$

known as the monochromatic luminosity of the star, so called because the wavelength lies within an infinitesimal interval and thus corresponds to a single color. The flux associated with the monochromatic luminosity is known as the monochromatic flux, given via the inverse-square law as

$$F_\lambda \, d\lambda = \frac{L_\lambda \, d\lambda}{4\pi r^2} = \frac{2\pi hc^2/\lambda^5}{e^{hc/(\lambda k_B T)} - 1} \left( \frac{R}{r} \right)^2 d\lambda,$$
where $r$ is the distance from the star to the observer. Now, integrating the monochromatic luminosity over all wavelengths, we obtain the total luminosity of the star:

$$L = \int_0^\infty L_\lambda \, d\lambda = 4\pi R^2 \sigma T^4,$$

recovering the Stefan-Boltzmann law, where we have used the fact that $\int_0^\infty \frac{x^3}{e^x - 1} \, dx = \frac{\pi^4}{15}$. We can prove this integral with complex analysis, as we have done before, but we will not do so here. From the substitution $x = hc/(\lambda k_B T)$, we can see that the constant $\sigma$ is given by

$$\sigma = \frac{2\pi^5 k_B^4}{15 c^2 h^3}.$$
This derivation shows how Planck's law leads to the Stefan-Boltzmann law and provides a theoretical basis for the effective temperature of stars.
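We can verify this numerically. The sketch below evaluates $\sigma$ from the closed-form expression (using SI values of the constants) and checks the dimensionless integral with a simple trapezoidal rule; the numerical scheme is purely illustrative:

```python
import math

h = 6.62607015e-34   # Planck's constant, J s
c = 2.99792458e8     # speed of light, m/s
k_B = 1.380649e-23   # Boltzmann's constant, J/K

# Stefan-Boltzmann constant from the closed-form expression derived above
sigma = 2 * math.pi**5 * k_B**4 / (15 * c**2 * h**3)
print(f"sigma = {sigma:.6e} W m^-2 K^-4")  # ~5.670374e-8, the accepted value

# Check the dimensionless integral  int_0^inf x^3 / (e^x - 1) dx = pi^4 / 15
# with a trapezoidal rule; the integrand is negligible beyond x = 40.
def integrand(x):
    return x**3 / math.expm1(x)

a, b, N = 1e-6, 40.0, 200_000
step = (b - a) / N
total = 0.5 * (integrand(a) + integrand(b))
total += sum(integrand(a + i * step) for i in range(1, N))
total *= step
print(f"integral ~ {total:.6f}, pi^4/15 = {math.pi**4 / 15:.6f}")
```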

Color and Spectral Classification

With the understanding of blackbody radiation and Planck's law, astronomers could now relate the color of a star to its temperature. An important question was how to classify, measure, and compare the light from different stars.

A star's brightness can be quantified using the magnitude system we discussed earlier. However, this system does not account for the color of the star. One idea is to use filters to isolate specific wavelength ranges and measure the brightness in those ranges. For example, the UBV photometric system uses three filters: U (ultraviolet), B (blue), and V (visual). The Johnson-Cousins UBVRI system extends this to include R (red) and I (infrared) filters.

We can measure each star's apparent magnitudes physically by observing them through these filters. A filter is essentially a piece of glass or plastic that only allows light of certain wavelengths to pass through. Historically, astronomers used photographic plates with different emulsions to create filters that were sensitive to specific wavelength ranges. Nowadays, we use more advanced materials and technologies to create filters with precise transmission properties.

The difference in magnitudes between two filters is known as a color index. For instance, the $B - V$ color index is defined as the difference between the apparent magnitudes measured through the B and V filters:

$$B - V = m_B - m_V.$$
When we measure the apparent and absolute magnitudes over all wavelengths, the results are known as the bolometric magnitudes $m_{\mathrm{bol}}$ and $M_{\mathrm{bol}}$. However, since we cannot observe all wavelengths (for instance, much ultraviolet and infrared light is absorbed by Earth's atmosphere), we often use the visual magnitude $V$ and absolute visual magnitude $M_V$, which are measured using the V filter. To account for the difference between the bolometric and visual magnitudes, we introduce the bolometric correction $BC$, defined as

$$BC = m_{\mathrm{bol}} - V = M_{\mathrm{bol}} - M_V.$$
Next, we want to relate the apparent magnitudes measured through different filters to the other physical properties of stars. A star's ultraviolet (U) magnitude, for instance, can be written as

$$U = -2.5 \log_{10} \left( \int_0^\infty F_\lambda S_U(\lambda) \, d\lambda \right) + C_U,$$
where $S_U(\lambda)$ is a sensitivity function that describes how the U filter responds to different wavelengths, and $C_U$ is a constant that depends on the calibration of the photometric system. Such a constant also exists when we measure the bolometric magnitude. A bolometer (a device that measures the total power of incident electromagnetic radiation), in ideal conditions, would measure the total flux from a star without any wavelength dependence. Thus we have $S(\lambda) = 1$ for all $\lambda$, and the bolometric magnitude is given by

$$m_{\mathrm{bol}} = -2.5 \log_{10} \left( \int_0^\infty F_\lambda \, d\lambda \right) + C_{\mathrm{bol}}.$$
The value for $C_{\mathrm{bol}}$ is arbitrary, much like how potential energy is defined up to an additive constant. Historically, since the bolometric correction is given by $BC = m_{\mathrm{bol}} - V$, astronomers have chosen $C_{\mathrm{bol}}$ such that $BC$ is negative for all stars while remaining as small in size as possible. This should make sense: the V filter misses some ultraviolet and infrared light, so the bolometric flux is at least the visual flux, and since magnitudes are defined such that brighter objects have lower magnitudes, $m_{\mathrm{bol}} \leq V$ and the correction is negative.

The $B - V$ color index can be calculated as

$$B - V = -2.5 \log_{10} \left( \frac{\int_0^\infty F_\lambda S_B(\lambda) \, d\lambda}{\int_0^\infty F_\lambda S_V(\lambda) \, d\lambda} \right) + C_{B-V},$$

where $C_{B-V} = C_B - C_V$.
If we plug in the monochromatic flux derived above, we have

$$B - V = -2.5 \log_{10} \left( \frac{\int_0^\infty \frac{2\pi hc^2/\lambda^5}{e^{hc/(\lambda k_B T)} - 1} S_B(\lambda) \, d\lambda}{\int_0^\infty \frac{2\pi hc^2/\lambda^5}{e^{hc/(\lambda k_B T)} - 1} S_V(\lambda) \, d\lambda} \right) + C_{B-V}.$$
In other words, the distance and radius of the star cancel out, and the color index only depends on the temperature of the star and the properties of the filters used.
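To make the cancellation concrete, here is a toy calculation. The boxcar filter bands and the zero-point (set to zero) are hypothetical stand-ins for the calibrated Johnson system, so the numbers are illustrative only, but the qualitative behavior is real: the index depends only on temperature, and hotter stars have smaller (bluer) values.

```python
import math

# Physical constants (SI units)
h, c, k_B = 6.62607015e-34, 2.99792458e8, 1.380649e-23

def planck(lam, T):
    """Planck spectral radiance B_lambda(T) for wavelength lam (m)."""
    return (2 * h * c**2 / lam**5) / math.expm1(h * c / (lam * k_B * T))

def band_flux(T, lo_nm, hi_nm, steps=2000):
    """Integrate B_lambda over an idealized boxcar filter (S = 1 in the band).
    The geometric factor (R/r)^2 is omitted: it cancels in the color index."""
    lo, hi = lo_nm * 1e-9, hi_nm * 1e-9
    step = (hi - lo) / steps
    return sum(planck(lo + (i + 0.5) * step, T) for i in range(steps)) * step

def color_index(T):
    """Toy B - V index with hypothetical boxcar bands near the real B and V
    filters; the zero-point constant C_{B-V} is set to 0 for illustration."""
    F_B = band_flux(T, 390, 490)  # rough "B" band
    F_V = band_flux(T, 500, 600)  # rough "V" band
    return -2.5 * math.log10(F_B / F_V)

for T in (4000, 5800, 10000, 20000):
    print(f"T = {T:>5} K  ->  toy B - V = {color_index(T):+.3f}")
```

Cool stars come out with a positive (red) index and hot stars with a negative (blue) one, which is exactly the trend astronomers exploit to estimate temperatures.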

Astronomers can use this relationship to estimate the temperature of a star based on its color index. If we plot the $U - B$ color index against the $B - V$ color index for a large number of stars, we obtain a diagram known as the color-color diagram. This diagram reveals a clear correlation between the two color indices, which can be used to classify stars based on their temperatures.

Summary and Next Steps

In this section, we have explored the historical development of our understanding of stellar parallax, the magnitude system, and the wave nature of light. We have seen how the work of astronomers and physicists over the centuries has led to the modern theories of electromagnetism and quantum mechanics. We have also discussed how these theories have been applied to understand the properties of stars, such as their temperatures and luminosities.

Once again, here are the key takeaways from this section, labeled based on who discovered or developed them. The list is not strictly chronological; rather, it is grouped by the main contributors to each idea.

  • Isaac Newton (1643–1727): Formulated the laws of motion and universal gravitation, laying the foundation for classical mechanics. Developed the reflecting telescope and made significant contributions to optics.
  • Jean-Baptiste Chappe d’Auteroche (1728–1769): Made the first successful measurement of the distance to Venus during a transit of Venus across the Sun (1761).
  • Friedrich Bessel (1784–1846): Made the first successful measurement of stellar parallax (61 Cygni, 1838).
  • Norman Pogson (1829–1891): Proposed the logarithmic scale for stellar magnitudes, establishing the modern magnitude system (1856).
  • Thomas Young (1773–1829): Conducted the double-slit experiment, demonstrating the wave nature of light (1801).
  • Sir George Biddell Airy (1801–1892): Worked out the mathematics of diffraction through a circular aperture, leading to the concept of the Airy disk and the Rayleigh criterion for resolution (1835).
  • Lord Rayleigh (1842–1919): Contributed to the Rayleigh criterion for resolution, and attempted to derive the blackbody radiation spectrum using classical physics (late 19th century).
  • Ole Rømer (1644–1710): Made the first quantitative estimate of the speed of light by observing the eclipses of Jupiter's moons (1676).
  • Christiaan Huygens (1629–1695): Proposed the wave theory of light and Huygens' principle, explaining the propagation of light waves (late 17th century).
  • James Clerk Maxwell (1831–1879): Formulated Maxwell's equations, unifying electricity, magnetism, and optics into electromagnetism (1860s).
  • Heinrich Hertz (1857–1894): Conducted experiments that confirmed the existence of electromagnetic waves, paving the way for the development of radio and wireless communication (1887).
  • John Henry Poynting (1852–1914): Introduced the Poynting vector, describing the energy flow in electromagnetic fields (1884).
  • Wilhelm Wien (1864–1928): Formulated Wien's displacement law, relating the temperature of a blackbody to the wavelength at which it emits radiation most intensely (1893).
  • Josef Stefan (1835–1893): Empirically derived the Stefan-Boltzmann law, relating the total energy radiated by a blackbody to the fourth power of its temperature (1879).
  • Ludwig Boltzmann (1844–1906): Theoretically derived the Stefan-Boltzmann law using thermodynamics and statistical mechanics (1884).
  • Max Planck (1858–1947): Proposed the quantization of energy and formulated Planck's law of blackbody radiation, laying the groundwork for quantum mechanics (1900).

In the next part of the history of astronomy, we are moving into the 20th century, where we will explore the development of quantum mechanics, general relativity, and other modern theories that have shaped our understanding of the universe.